Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 501 |
| Missing cells | 655 |
| Missing cells (%) | 8.7% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 58.8 KiB |
| Average record size in memory | 120.3 B |
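
The percentage in this table follows directly from the raw counts. As a quick check, a minimal sketch in plain Python recomputes the share of missing cells; the variable names are illustrative and not part of the report.

```python
# Counts copied from the "Dataset statistics" table above.
n_variables = 15
n_observations = 501
missing_cells = 655

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```
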
Variable types
| NUM | 10 |
|---|---|
| CAT | 4 |
| UNSUPPORTED | 1 |
Reproduction
| Analysis started | 2020-09-12 13:59:06.400442 |
|---|---|
| Analysis finished | 2020-09-12 14:00:05.318149 |
| Duration | 58.92 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
Warnings
| Warning | Type |
|---|---|
| City has a high cardinality: 497 distinct values | High cardinality |
| Female Population is highly correlated with Population [2011] | High correlation |
| Population [2011] is highly correlated with Female Population | High correlation |
| Population [2011] has 6 (1.2%) missing values | Missing |
| Popuation [2001] has 501 (100.0%) missing values | Missing |
| Median Age has 13 (2.6%) missing values | Missing |
| Avg Temp has 14 (2.8%) missing values | Missing |
| SWM has 9 (1.8%) missing values | Missing |
| Toilets Avl has 22 (4.4%) missing values | Missing |
| Water Purity has 19 (3.8%) missing values | Missing |
| H Index has 15 (3.0%) missing values | Missing |
| Female Population has 15 (3.0%) missing values | Missing |
| # of hospitals has 17 (3.4%) missing values | Missing |
| Foreign Visitors has 17 (3.4%) missing values | Missing |
| City is uniformly distributed | Uniform |
| Popuation [2001] is an unsupported type, check if it needs cleaning or further analysis | Unsupported |
City
Categorical
| Distinct count | 497 |
|---|---|
| Unique (%) | 99.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.9 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Pratapgarh | 2 | 0.4% | |
| Narsinghgarh | 2 | 0.4% | |
| Sumerpur | 2 | 0.4% | |
| Shahpura | 2 | 0.4% | |
| Warhapur | 1 | 0.2% | |
| Doiwala | 1 | 0.2% | |
| Nandaprayag | 1 | 0.2% | |
| Shikaripur | 1 | 0.2% | |
| Pavagada | 1 | 0.2% | |
| Dharchula | 1 | 0.2% | |
| Other values (487) | 487 | 97.2% |
Length
| Max length | 24 |
|---|---|
| Median length | 8 |
| Mean length | 8.546906188 |
| Min length | 3 |
State
Categorical
| Distinct count | 29 |
|---|---|
| Unique (%) | 5.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.9 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| Uttar Pradesh | 56 | 11.2% | |
| Maharashtra | 51 | 10.2% | |
| Uttarakhand | 51 | 10.2% | |
| Tamil Nadu | 51 | 10.2% | |
| Rajasthan | 48 | 9.6% | |
| Karnataka | 38 | 7.6% | |
| Madhya Pradesh | 37 | 7.4% | |
| Gujarat | 22 | 4.4% | |
| Bihar | 22 | 4.4% | |
| Kerala | 18 | 3.6% | |
| Other values (19) | 107 | 21.4% |
Length
| Max length | 22 |
|---|---|
| Median length | 10 |
| Mean length | 10.04191617 |
| Min length | 5 |
Type
Categorical
| Distinct count | 32 |
|---|---|
| Unique (%) | 6.4% |
| Missing | 2 |
| Missing (%) | 0.4% |
| Memory size | 3.9 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| M | 119 | 23.8% | |
| N.P | 86 | 17.2% | |
| M.Cl | 61 | 12.2% | |
| T.P | 42 | 8.4% | |
| C.T | 41 | 8.2% | |
| T.M.C | 22 | 4.4% | |
| N.P.P | 19 | 3.8% | |
| N.A | 19 | 3.8% | |
| M.B | 16 | 3.2% | |
| UA | 8 | 1.6% | |
| Other values (22) | 66 | 13.2% |
Length
| Max length | 6 |
|---|---|
| Median length | 3 |
| Mean length | 2.932135729 |
| Min length | 1 |
Population [2011]
Real number (ℝ≥0)
| Distinct count | 486 |
|---|---|
| Unique (%) | 98.2% |
| Missing | 6 |
| Missing (%) | 1.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 24747.468686868688 |
|---|---|
| Minimum | 110.0 |
| Maximum | 36774.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 KiB |
Quantile statistics
| Minimum | 110 |
|---|---|
| 5-th percentile | 7577.7 |
| Q1 | 21435 |
| median | 25199 |
| Q3 | 30763 |
| 95-th percentile | 35414.3 |
| Maximum | 36774 |
| Range | 36664 |
| Interquartile range (IQR) | 9328 |
Descriptive statistics
| Standard deviation | 7813.0675 |
|---|---|
| Coefficient of variation (CV) | 0.3157117845 |
| Kurtosis | 0.7106377163 |
| Mean | 24747.46869 |
| Median Absolute Deviation (MAD) | 4344 |
| Skewness | -0.9115545052 |
| Sum | 12249997 |
| Variance | 61044023.76 |
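
Several of the figures above are derived from one another. The following sketch, in plain Python with values copied from the quantile and descriptive statistics tables above, shows the relationships; it is an illustration of the arithmetic, not the profiler's own code, and the variable names are assumptions.

```python
# Values copied from the statistics tables above.
minimum, maximum = 110, 36774
q1, q3 = 21435, 30763
mean = 24747.46869
std = 7813.0675
n_valid = 501 - 6  # observations minus missing values

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance is the squared standard deviation
total = mean * n_valid           # Sum is the mean times the non-missing count

print(value_range, iqr, round(cv, 4), round(total))
```

Because the inputs are rounded as printed in the report, the recomputed variance and sum agree with the tables only up to rounding.
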
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 23234 | 3 | 0.6% | |
| 23456 | 2 | 0.4% | |
| 27815 | 2 | 0.4% | |
| 22781 | 2 | 0.4% | |
| 22516 | 2 | 0.4% | |
| 21643 | 2 | 0.4% | |
| 6309 | 2 | 0.4% | |
| 23331 | 2 | 0.4% | |
| 36172 | 1 | 0.2% | |
| 26521 | 1 | 0.2% | |
| Other values (476) | 476 | 95.0% | |
| (Missing) | 6 | 1.2% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 110 | 1 | 0.2% | |
| 612 | 1 | 0.2% | |
| 1517 | 1 | 0.2% | |
| 1641 | 1 | 0.2% | |
| 2152 | 1 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 36774 | 1 | 0.2% | |
| 36754 | 1 | 0.2% | |
| 36732 | 1 | 0.2% | |
| 36706 | 1 | 0.2% | |
| 36669 | 1 | 0.2% |
Sex Ratio
Real number (ℝ≥0)
| Distinct count | 145 |
|---|---|
| Unique (%) | 29.2% |
| Missing | 5 |
| Missing (%) | 1.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 895.508064516129 |
|---|---|
| Minimum | 774.0 |
| Maximum | 991.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 KiB |
Quantile statistics
| Minimum | 774 |
|---|---|
| 5-th percentile | 839.5 |
| Q1 | 867.75 |
| median | 890.5 |
| Q3 | 922 |
| 95-th percentile | 963 |
| Maximum | 991 |
| Range | 217 |
| Interquartile range (IQR) | 54.25 |
Descriptive statistics
| Standard deviation | 38.46415011 |
|---|---|
| Coefficient of variation (CV) | 0.0429523213 |
| Kurtosis | -0.5558492085 |
| Mean | 895.5080645 |
| Median Absolute Deviation (MAD) | 27.5 |
| Skewness | 0.2749158367 |
| Sum | 444172 |
| Variance | 1479.490844 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 869 | 10 | 2.0% | |
| 910 | 10 | 2.0% | |
| 868 | 9 | 1.8% | |
| 875 | 9 | 1.8% | |
| 874 | 8 | 1.6% | |
| 870 | 8 | 1.6% | |
| 919 | 8 | 1.6% | |
| 877 | 7 | 1.4% | |
| 876 | 7 | 1.4% | |
| 863 | 7 | 1.4% | |
| Other values (135) | 413 | 82.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 774 | 1 | 0.2% | |
| 823 | 1 | 0.2% | |
| 827 | 1 | 0.2% | |
| 831 | 5 | 1.0% | |
| 832 | 1 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 991 | 1 | 0.2% | |
| 986 | 1 | 0.2% | |
| 985 | 1 | 0.2% | |
| 983 | 1 | 0.2% | |
| 982 | 1 | 0.2% |
Median Age
Real number (ℝ≥0)
| Distinct count | 10 |
|---|---|
| Unique (%) | 2.0% |
| Missing | 13 |
| Missing (%) | 2.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.12090163934426 |
|---|---|
| Minimum | 23.0 |
| Maximum | 32.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 KiB |
Quantile statistics
| Minimum | 23 |
|---|---|
| 5-th percentile | 23 |
| Q1 | 24 |
| median | 26 |
| Q3 | 28 |
| 95-th percentile | 29 |
| Maximum | 32 |
| Range | 9 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.145558807 |
|---|---|
| Coefficient of variation (CV) | 0.08213953854 |
| Kurtosis | -0.8697617891 |
| Mean | 26.12090164 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.2156424531 |
| Sum | 12747 |
| Variance | 4.603422594 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 24 | 75 | 15.0% | |
| 29 | 74 | 14.8% | |
| 25 | 71 | 14.2% | |
| 26 | 70 | 14.0% | |
| 28 | 68 | 13.6% | |
| 23 | 64 | 12.8% | |
| 27 | 54 | 10.8% | |
| 30 | 5 | 1.0% | |
| 32 | 5 | 1.0% | |
| 31 | 2 | 0.4% | |
| (Missing) | 13 | 2.6% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 23 | 64 | 12.8% | |
| 24 | 75 | 15.0% | |
| 25 | 71 | 14.2% | |
| 26 | 70 | 14.0% | |
| 27 | 54 | 10.8% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 32 | 5 | 1.0% | |
| 31 | 2 | 0.4% | |
| 30 | 5 | 1.0% | |
| 29 | 74 | 14.8% | |
| 28 | 68 | 13.6% |
Avg Temp
Real number (ℝ≥0)
| Distinct count | 27 |
|---|---|
| Unique (%) | 5.5% |
| Missing | 14 |
| Missing (%) | 2.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 29.100616016427104 |
|---|---|
| Minimum | 5.0 |
| Maximum | 40.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 KiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 26 |
| median | 31 |
| Q3 | 36 |
| 95-th percentile | 39 |
| Maximum | 40 |
| Range | 35 |
| Interquartile range (IQR) | 10 |
Descriptive statistics
| Standard deviation | 9.295787556 |
|---|---|
| Coefficient of variation (CV) | 0.3194361092 |
| Kurtosis | 0.3458727314 |
| Mean | 29.10061602 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | -1.139441722 |
| Sum | 14172 |
| Variance | 86.41166629 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 35 | 38 | 7.6% | |
| 34 | 37 | 7.4% | |
| 38 | 32 | 6.4% | |
| 26 | 30 | 6.0% | |
| 39 | 30 | 6.0% | |
| 25 | 25 | 5.0% | |
| 37 | 25 | 5.0% | |
| 31 | 23 | 4.6% | |
| 28 | 23 | 4.6% | |
| 29 | 22 | 4.4% | |
| Other values (17) | 202 | 40.3% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 5 | 7 | 1.4% | |
| 6 | 3 | 0.6% | |
| 7 | 5 | 1.0% | |
| 8 | 9 | 1.8% | |
| 9 | 6 | 1.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 40 | 20 | 4.0% | |
| 39 | 30 | 6.0% | |
| 38 | 32 | 6.4% | |
| 37 | 25 | 5.0% | |
| 36 | 19 | 3.8% |
SWM
Categorical
| Distinct count | 3 |
|---|---|
| Unique (%) | 0.6% |
| Missing | 9 |
| Missing (%) | 1.8% |
| Memory size | 3.9 KiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| LOW | 179 | 35.7% | |
| HIGH | 158 | 31.5% | |
| MEDIUM | 155 | 30.9% | |
| (Missing) | 9 | 1.8% |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.243512974 |
| Min length | 3 |
Toilets Avl
Real number (ℝ≥0)
| Distinct count | 62 |
|---|---|
| Unique (%) | 12.9% |
| Missing | 22 |
| Missing (%) | 4.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 72.2776617954071 |
|---|---|
| Minimum | 10.0 |
| Maximum | 100.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 18.9 |
| Q1 | 61 |
| median | 74 |
| Q3 | 90 |
| 95-th percentile | 98 |
| Maximum | 100 |
| Range | 90 |
| Interquartile range (IQR) | 29 |
Descriptive statistics
| Standard deviation | 20.79900178 |
|---|---|
| Coefficient of variation (CV) | 0.2877652826 |
| Kurtosis | 1.131275088 |
| Mean | 72.2776618 |
| Median Absolute Deviation (MAD) | 15 |
| Skewness | -1.039213286 |
| Sum | 34621 |
| Variance | 432.5984749 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 69 | 17 | 3.4% | |
| 97 | 16 | 3.2% | |
| 92 | 15 | 3.0% | |
| 94 | 15 | 3.0% | |
| 96 | 13 | 2.6% | |
| 71 | 13 | 2.6% | |
| 50 | 12 | 2.4% | |
| 95 | 12 | 2.4% | |
| 80 | 12 | 2.4% | |
| 57 | 12 | 2.4% | |
| Other values (52) | 342 | 68.3% | |
| (Missing) | 22 | 4.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 10 | 2 | 0.4% | |
| 11 | 2 | 0.4% | |
| 12 | 4 | 0.8% | |
| 13 | 1 | 0.2% | |
| 14 | 3 | 0.6% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 100 | 9 | 1.8% | |
| 99 | 9 | 1.8% | |
| 98 | 7 | 1.4% | |
| 97 | 16 | 3.2% | |
| 96 | 13 | 2.6% |
Water Purity
Real number (ℝ≥0)
| Distinct count | 99 |
|---|---|
| Unique (%) | 20.5% |
| Missing | 19 |
| Missing (%) | 3.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 151.35892116182572 |
|---|---|
| Minimum | 100.0 |
| Maximum | 200.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 KiB |
Quantile statistics
| Minimum | 100 |
|---|---|
| 5-th percentile | 105 |
| Q1 | 127 |
| median | 152 |
| Q3 | 175 |
| 95-th percentile | 194.95 |
| Maximum | 200 |
| Range | 100 |
| Interquartile range (IQR) | 48 |
Descriptive statistics
| Standard deviation | 28.71919055 |
|---|---|
| Coefficient of variation (CV) | 0.1897423048 |
| Kurtosis | -1.168462088 |
| Mean | 151.3589212 |
| Median Absolute Deviation (MAD) | 24.5 |
| Skewness | -0.09407192221 |
| Sum | 72955 |
| Variance | 824.7919057 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 114 | 10 | 2.0% | |
| 136 | 10 | 2.0% | |
| 144 | 10 | 2.0% | |
| 124 | 9 | 1.8% | |
| 110 | 9 | 1.8% | |
| 166 | 9 | 1.8% | |
| 146 | 9 | 1.8% | |
| 178 | 9 | 1.8% | |
| 160 | 9 | 1.8% | |
| 174 | 8 | 1.6% | |
| Other values (89) | 390 | 77.8% | |
| (Missing) | 19 | 3.8% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 100 | 7 | 1.4% | |
| 101 | 2 | 0.4% | |
| 102 | 2 | 0.4% | |
| 103 | 4 | 0.8% | |
| 104 | 5 | 1.0% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 200 | 4 | 0.8% | |
| 199 | 4 | 0.8% | |
| 198 | 8 | 1.6% | |
| 197 | 6 | 1.2% | |
| 195 | 3 | 0.6% |
H Index
Real number (ℝ≥0)
| Distinct count | 486 |
|---|---|
| Unique (%) | 100.0% |
| Missing | 15 |
| Missing (%) | 3.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.501041634853053 |
|---|---|
| Minimum | 0.0009574363037994083 |
| Maximum | 0.9999010902726044 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 KiB |
Quantile statistics
| Minimum | 0.0009574363038 |
|---|---|
| 5-th percentile | 0.062014457 |
| Q1 | 0.2666187181 |
| median | 0.5082177095 |
| Q3 | 0.737776037 |
| 95-th percentile | 0.944058368 |
| Maximum | 0.9999010903 |
| Range | 0.998943654 |
| Interquartile range (IQR) | 0.4711573189 |
Descriptive statistics
| Standard deviation | 0.2843004523 |
|---|---|
| Coefficient of variation (CV) | 0.5674188182 |
| Kurtosis | -1.138861144 |
| Mean | 0.5010416349 |
| Median Absolute Deviation (MAD) | 0.2379787621 |
| Skewness | 0.004863596619 |
| Sum | 243.5062345 |
| Variance | 0.08082674719 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0.8867904331 | 1 | 0.2% | |
| 0.1746321156 | 1 | 0.2% | |
| 0.3189381151 | 1 | 0.2% | |
| 0.403033013 | 1 | 0.2% | |
| 0.2414900063 | 1 | 0.2% | |
| 0.7442951681 | 1 | 0.2% | |
| 0.08814416623 | 1 | 0.2% | |
| 0.8996637671 | 1 | 0.2% | |
| 0.630238193 | 1 | 0.2% | |
| 0.63898617 | 1 | 0.2% | |
| Other values (476) | 476 | 95.0% | |
| (Missing) | 15 | 3.0% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0.0009574363038 | 1 | 0.2% | |
| 0.002001209807 | 1 | 0.2% | |
| 0.003805058303 | 1 | 0.2% | |
| 0.004338165469 | 1 | 0.2% | |
| 0.005412633668 | 1 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0.9999010903 | 1 | 0.2% | |
| 0.997322788 | 1 | 0.2% | |
| 0.9961231447 | 1 | 0.2% | |
| 0.9952600462 | 1 | 0.2% | |
| 0.9919635495 | 1 | 0.2% |
Female Population
Real number (ℝ≥0)
| Distinct count | 482 |
|---|---|
| Unique (%) | 99.2% |
| Missing | 15 |
| Missing (%) | 3.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22542.633744855968 |
|---|---|
| Minimum | 0.0 |
| Maximum | 34523.0 |
| Zeros | 1 |
| Zeros (%) | 0.2% |
| Memory size | 3.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 7698.5 |
| Q1 | 19449.75 |
| median | 22998.5 |
| Q3 | 27701.75 |
| 95-th percentile | 31957.5 |
| Maximum | 34523 |
| Range | 34523 |
| Interquartile range (IQR) | 8252 |
Descriptive statistics
| Standard deviation | 6931.232314 |
|---|---|
| Coefficient of variation (CV) | 0.3074721611 |
| Kurtosis | 1.044417715 |
| Mean | 22542.63374 |
| Median Absolute Deviation (MAD) | 3994.5 |
| Skewness | -0.9312305215 |
| Sum | 10955720 |
| Variance | 48041981.4 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 19004 | 2 | 0.4% | |
| 21857 | 2 | 0.4% | |
| 19316 | 2 | 0.4% | |
| 27362 | 2 | 0.4% | |
| 27600 | 1 | 0.2% | |
| 5382 | 1 | 0.2% | |
| 19529 | 1 | 0.2% | |
| 8878 | 1 | 0.2% | |
| 19336 | 1 | 0.2% | |
| 30221 | 1 | 0.2% | |
| Other values (472) | 472 | 94.2% | |
| (Missing) | 15 | 3.0% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 0 | 1 | 0.2% | |
| 94 | 1 | 0.2% | |
| 522 | 1 | 0.2% | |
| 1292 | 1 | 0.2% | |
| 1392 | 1 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 34523 | 1 | 0.2% | |
| 34360 | 1 | 0.2% | |
| 34328 | 1 | 0.2% | |
| 34237 | 1 | 0.2% | |
| 34114 | 1 | 0.2% |
# of hospitals
Real number (ℝ≥0)
| Distinct count | 27 |
|---|---|
| Unique (%) | 5.6% |
| Missing | 17 |
| Missing (%) | 3.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 19.173553719008265 |
|---|---|
| Minimum | 3.0 |
| Maximum | 30.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 KiB |
Quantile statistics
| Minimum | 3 |
|---|---|
| 5-th percentile | 7.15 |
| Q1 | 14 |
| median | 20 |
| Q3 | 25 |
| 95-th percentile | 29 |
| Maximum | 30 |
| Range | 27 |
| Interquartile range (IQR) | 11 |
Descriptive statistics
| Standard deviation | 6.69714897 |
|---|---|
| Coefficient of variation (CV) | 0.3492909592 |
| Kurtosis | -0.6789259198 |
| Mean | 19.17355372 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | -0.3042408893 |
| Sum | 9280 |
| Variance | 44.85180432 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 21 | 33 | 6.6% | |
| 28 | 29 | 5.8% | |
| 12 | 26 | 5.2% | |
| 20 | 25 | 5.0% | |
| 23 | 24 | 4.8% | |
| 19 | 24 | 4.8% | |
| 15 | 23 | 4.6% | |
| 24 | 22 | 4.4% | |
| 11 | 21 | 4.2% | |
| 26 | 21 | 4.2% | |
| Other values (17) | 236 | 47.1% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3 | 3 | 0.6% | |
| 4 | 9 | 1.8% | |
| 5 | 6 | 1.2% | |
| 6 | 5 | 1.0% | |
| 7 | 2 | 0.4% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 30 | 16 | 3.2% | |
| 29 | 19 | 3.8% | |
| 28 | 29 | 5.8% | |
| 27 | 17 | 3.4% | |
| 26 | 21 | 4.2% |
Foreign Visitors
Real number (ℝ≥0)
| Distinct count | 28 |
|---|---|
| Unique (%) | 5.8% |
| Missing | 17 |
| Missing (%) | 3.4% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1676300.902892562 |
|---|---|
| Minimum | 798.0 |
| Maximum | 4684707.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.9 KiB |
Quantile statistics
| Minimum | 798 |
|---|---|
| 5-th percentile | 34886 |
| Q1 | 284973 |
| median | 923737 |
| Q3 | 3104060 |
| 95-th percentile | 4684707 |
| Maximum | 4684707 |
| Range | 4683909 |
| Interquartile range (IQR) | 2819087 |
Descriptive statistics
| Standard deviation | 1704860.432 |
|---|---|
| Coefficient of variation (CV) | 1.017037233 |
| Kurtosis | -1.041338164 |
| Mean | 1676300.903 |
| Median Absolute Deviation (MAD) | 755952 |
| Skewness | 0.7625311928 |
| Sum | 811329637 |
| Variance | 2.906549094e+12 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3104060 | 56 | 11.2% | |
| 4684707 | 49 | 9.8% | |
| 4408916 | 49 | 9.8% | |
| 1475311 | 47 | 9.4% | |
| 105882 | 45 | 9.0% | |
| 636502 | 37 | 7.4% | |
| 421365 | 35 | 7.0% | |
| 923737 | 22 | 4.4% | |
| 284973 | 22 | 4.4% | |
| 977479 | 18 | 3.6% | |
| Other values (18) | 104 | 20.8% | |
| (Missing) | 17 | 3.4% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 798 | 1 | 0.2% | |
| 1797 | 1 | 0.2% | |
| 2769 | 3 | 0.6% | |
| 3260 | 2 | 0.4% | |
| 5705 | 2 | 0.4% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4684707 | 49 | 9.8% | |
| 4408916 | 49 | 9.8% | |
| 3104060 | 56 | 11.2% | |
| 1489500 | 14 | 2.8% | |
| 1475311 | 47 | 9.4% |
Pearson's r
Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1, where -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r; only the sign of its slope does.
To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
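
The calculation described above can be sketched in plain Python. This is a minimal illustration on hypothetical toy data, not the profiler's implementation; `pearson_r` and the sample lists are names introduced here.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy data (hypothetical, not taken from the report).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

r_up = pearson_r(xs, ys)                                # increasing line
r_down = pearson_r(xs, [-y for y in ys])                # decreasing line
r_shifted = pearson_r(xs, [3.0 * y + 7.0 for y in ys])  # rescaled and shifted
```

Rescaling and shifting `ys` leaves r unchanged, while flipping its sign flips the sign of r, which is the invariance property mentioned above.
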
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, where -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.
To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
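
As a rough sketch of that recipe in plain Python: rank both variables, then apply the covariance-over-standard-deviations formula to the ranks. The helper names and the toy data are assumptions made for illustration, and ties are handled here by average ranks, which is one common convention.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: covariance of the ranks over the product of their standard deviations."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Toy data (hypothetical): nonlinear but strictly increasing.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
```

Here the relationship is cubic rather than linear, yet the ranks of `xs` and `ys` coincide, so ρ reaches its maximum.
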
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, where -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.
To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
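
The pair-counting definition above can be sketched directly in plain Python. This follows the formula as stated in the text (often called tau-a); library implementations commonly default to a tie-adjusted variant, so their results may differ when ties are present. The function name and toy data are illustrative assumptions.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall's tau as described above: (concordant - discordant) / total pairs."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # The sign of the product tells whether both variables move the same way.
        s = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 is a tie: it counts toward the total but toward neither group.
    return (concordant - discordant) / len(pairs)

# Toy data (hypothetical): one swapped pair out of six.
xs = [1, 2, 3, 4]
ys = [1, 3, 2, 4]
tau = kendall_tau(xs, ys)  # 5 concordant, 1 discordant, 6 pairs
```
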
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | City | State | Type | Population [2011] | Popuation [2001] | Sex Ratio | Median Age | Avg Temp | SWM | Toilets Avl | Water Purity | H Index | Female Population | # of hospitals | Foreign Visitors |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Tuensang | Nagaland | T.C | 36774.0 | NaN | 931.0 | 23.0 | 10.0 | MEDIUM | 94.0 | 114.0 | 0.253390 | 34237.0 | 17.0 | 2769.0 |
| 1 | Lakshmeshwar | Karnataka | T.M.C | 36754.0 | NaN | 934.0 | 25.0 | 38.0 | HIGH | 62.0 | 160.0 | 0.192555 | 34328.0 | 13.0 | 636502.0 |
| 2 | Zira | Punjab | M.Cl. | 36732.0 | NaN | 883.0 | 29.0 | 35.0 | HIGH | 63.0 | 105.0 | 0.887882 | 32434.0 | 17.0 | 242367.0 |
| 3 | Yawal | Maharashtra | M.Cl | 36706.0 | NaN | 887.0 | 26.0 | 31.0 | HIGH | 60.0 | 174.0 | 0.407838 | 32558.0 | 11.0 | 4408916.0 |
| 4 | Thana Bhawan | Uttar Pradesh | N.P. | 36669.0 | NaN | 877.0 | 28.0 | 39.0 | LOW | 92.0 | 153.0 | 0.324456 | 32159.0 | 23.0 | 3104060.0 |
| 5 | Ramdurg | Karnataka | UA | 36649.0 | NaN | 942.0 | 27.0 | 28.0 | MEDIUM | 92.0 | 185.0 | 0.571883 | 34523.0 | 30.0 | 636502.0 |
| 6 | Pulgaon | Maharashtra | M.Cl | 36522.0 | NaN | 887.0 | 26.0 | 31.0 | MEDIUM | 72.0 | 108.0 | 0.271195 | 32395.0 | 11.0 | 4408916.0 |
| 7 | Sadasivpet | Telangana | M | 36334.0 | NaN | 921.0 | 27.0 | 40.0 | LOW | 70.0 | 116.0 | 0.494227 | 33464.0 | 17.0 | 126078.0 |
| 8 | Nargund | Karnataka | T.M.C | 36291.0 | NaN | 940.0 | 23.0 | 37.0 | LOW | 77.0 | 148.0 | 0.708562 | 34114.0 | 21.0 | 636502.0 |
| 9 | Neem-Ka-Thana | Rajasthan | M | 36231.0 | NaN | 850.0 | 25.0 | 25.0 | MEDIUM | 61.0 | 148.0 | 0.592325 | 30796.0 | 29.0 | 1475311.0 |
Last rows
| | City | State | Type | Population [2011] | Popuation [2001] | Sex Ratio | Median Age | Avg Temp | SWM | Toilets Avl | Water Purity | H Index | Female Population | # of hospitals | Foreign Visitors |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 491 | Bhaiseena | Rajasthan | G.P | 3200.0 | NaN | 869.0 | 24.0 | 34.0 | LOW | 17.0 | 167.0 | 0.092957 | 2781.0 | 4.0 | 1475311.0 |
| 492 | Dwarahat | Uttarakhand | N.P | 2749.0 | NaN | 836.0 | 25.0 | 12.0 | HIGH | 18.0 | 146.0 | 0.186739 | 2298.0 | 8.0 | 105882.0 |
| 493 | Badrinath | Uttarakhand | N.P | 2438.0 | NaN | 848.0 | 29.0 | 12.0 | LOW | 19.0 | 190.0 | 0.432991 | 2067.0 | 4.0 | 105882.0 |
| 494 | Dogadda | Uttarakhand | N.P.P | 2422.0 | NaN | 840.0 | 26.0 | 11.0 | HIGH | 11.0 | 146.0 | 0.030421 | 2034.0 | 4.0 | 105882.0 |
| 495 | Devprayag | Uttarakhand | N.P | 2152.0 | NaN | 840.0 | 29.0 | 7.0 | MEDIUM | 14.0 | 124.0 | 0.503070 | 1808.0 | 8.0 | 105882.0 |
| 496 | Nandaprayag | Uttarakhand | N.P | 1641.0 | NaN | 848.0 | 27.0 | 7.0 | MEDIUM | 12.0 | 181.0 | 0.316926 | 1392.0 | 4.0 | 105882.0 |
| 497 | Kirtinagar | Uttarakhand | N.P | 1517.0 | NaN | 852.0 | 28.0 | 12.0 | HIGH | 16.0 | 198.0 | 0.336852 | 1292.0 | 6.0 | 105882.0 |
| 498 | Kedarnath | Uttarakhand | N.P | 612.0 | NaN | 853.0 | 24.0 | 9.0 | LOW | 19.0 | 189.0 | 0.723253 | 522.0 | 6.0 | 105882.0 |
| 499 | Gangotri | Uttarakhand | N.P | 110.0 | NaN | 852.0 | 27.0 | 8.0 | MEDIUM | 18.0 | 170.0 | 0.421061 | 94.0 | 8.0 | 105882.0 |
| 500 | Kumarganj | Uttar Pradesh | C.T | NaN | NaN | 863.0 | 24.0 | 35.0 | HIGH | 19.0 | 149.0 | 0.154375 | 0.0 | 6.0 | 3104060.0 |